Cascade Markov random fields for stroke extraction of Chinese characters
Abstract
Extracting perceptually meaningful strokes plays an essential role in modeling structures of handwritten Chinese characters for accurate character recognition. This paper proposes a cascade Markov random field (MRF) model that combines both bottom-up (BU) and top-down (TD) processes for stroke extraction. In the low-level stroke segmentation process, we use a BU MRF model with a smoothness prior to segment the character skeleton into directional substrokes based on self-organization of pixel-based directional features. In the high-level stroke extraction process, the segmented substrokes are sent to a TD MRF-based character model that, in turn, feeds back to guide the merging of corresponding substrokes to produce reliable candidate strokes for character recognition. The merit of the cascade MRF model is due to its ability to encode the local statistical dependencies of neighboring stroke components as well as prior knowledge of Chinese character structures. Encouraging stroke extraction and character recognition results confirm the effectiveness of our method, which integrates both BU/TD vision processing streams within the unified MRF framework.
Preprint submitted to Elsevier, 29 September 2009.
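The low-level segmentation step described in the abstract (labeling skeleton pixels by direction under a smoothness prior, then grouping them into substrokes) can be illustrated with a toy example. The sketch below is not the paper's model: it assumes a single ordered chain of skeleton pixels, a hypothetical set of four direction labels, an angular-distance data term, and a Potts penalty `beta` for label changes between neighbors, and it finds the minimum-energy labeling by dynamic programming, which is exact on a chain. All names (`DIRECTIONS`, `label_chain`, `substrokes`) are illustrative.

```python
# Hypothetical label set: four stroke orientations in degrees (taken modulo 180).
DIRECTIONS = [0.0, 45.0, 90.0, 135.0]


def angular_distance(a, b):
    """Smallest difference between two orientations, in degrees (0..90)."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def label_chain(observed, beta=30.0):
    """Minimum-energy labeling of a pixel chain.

    observed: noisy local orientation (degrees) at each skeleton pixel, in order.
    beta: Potts smoothness penalty paid when neighboring pixels differ in label.
    Returns a list of indices into DIRECTIONS.
    """
    n, k = len(observed), len(DIRECTIONS)
    if n == 0:
        return []
    # cost[i][j]: best energy of pixels 0..i given that pixel i takes label j
    cost = [[angular_distance(observed[0], d) for d in DIRECTIONS]]
    back = []
    for i in range(1, n):
        prev = cost[-1]
        row, ptr = [], []
        for j in range(k):
            # Best previous label when pixel i takes label j
            best = min(range(k), key=lambda p: prev[p] + (0.0 if p == j else beta))
            step = 0.0 if best == j else beta
            row.append(prev[best] + step + angular_distance(observed[i], DIRECTIONS[j]))
            ptr.append(best)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final label
    labels = [min(range(k), key=lambda j: cost[-1][j])]
    for ptr in reversed(back):
        labels.append(ptr[labels[-1]])
    labels.reverse()
    return labels


def substrokes(labels):
    """Group consecutive equal labels into (label, first_index, last_index) runs."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i - 1))
            start = i
    return runs


if __name__ == "__main__":
    # A roughly horizontal run with one noisy pixel, followed by a vertical run.
    observed = [2, 5, 40, 3, 1, 88, 92, 85, 91]
    print(substrokes(label_chain(observed)))
```

With `beta=0` each pixel simply takes its nearest direction, so the noisy third pixel splits the horizontal run; a positive `beta` absorbs it into the surrounding substroke. A real character skeleton has junctions and a 2D neighborhood, where exact dynamic programming no longer applies and an approximate MRF inference method would be needed.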
Similar resources
Hidden Markov Random Field Based Approach for Off-Line Handwritten Chinese Character Recognition
This paper presents a Hidden Markov Mesh Random Field (HMMRF) based approach for off-line handwritten Chinese character recognition using statistical observation sequences embedded in the strokes of a character. Because of the large set of Chinese characters and the many different writing styles, the recognition of handwritten Chinese characters is very challenging. In our approach, the binary image is ...
Transliteration Extraction from Classical Chinese Buddhist Literature Using Conditional Random Fields
Extracting plausible transliterations from historical literature is a key issue in historical linguistics and other research fields. In Chinese historical literature, the characters used to transliterate the same loanword may vary because of different translation eras or different Chinese language preferences among translators. To assist historical linguistics and digital humanities researchers, t...
Combination of Machine Learning Methods for Optimum Chinese Word Segmentation
This article presents our recent work for participation in the Second International Chinese Word Segmentation Bakeoff. Our system performs two procedures: out-of-vocabulary extraction and word segmentation. We compose three out-of-vocabulary extraction modules: character-based tagging with different classifiers – maximum entropy, support vector machines, and conditional random fields. We also co...
A Run-Length Coding Based Approach to Stroke Extraction of Chinese Characters
Traditional stroke extraction approaches usually adopt thinning as the preprocessing step for obtaining the skeletons of Chinese characters. However, thinning may produce spurious branches and multiple fork points at junctions. Such distortion makes the stroke extraction process more complicated and unreliable. This paper proposes a novel run-length-based stroke extraction approach wit...
A Model of Stroke Extraction from Chinese Character Images
Given the large number and complexity of Chinese characters, pattern matching based on structural decomposition and analysis is believed to be necessary and essential to off-line character recognition. This paper proposes a new model of stroke extraction for Chinese characters. One problem for stroke extraction is how to extract primary strokes. Another major problem is to solve the segmentatio...
Journal title:
- Inf. Sci.
Volume 180
Publication date: 2010